Text File | 1991-10-01 | 15KB | 333 lines
LOVE Forth addressing and segmentation
--------------------------------------
Almost all languages have problems running on the 8086/88, but
these problems for FORTH are especially severe. Most FORTH systems on
this architecture are restricted to 64K of main memory for program and
data, and are referred to as small memory models. This restricts the
user and programs to a small amount of memory, but offers the highest
possible execution speed. 32 bit FORTHS have been produced that offer
a large address space, but performance has been severely degraded. The
segmentation approach taken in LOVE FORTH offers both a large memory
size ( 320K ) and very fast execution speed.
Rather than offer a large contiguous memory space, LOVE Forth
has divided up the forth model by function. There are separate
segments for machine code, threaded addresses, data, stacks and
dictionary headers. As the source code is compiled it is parcelled
into these five segments. There is no execution time penalty over
Forths with the small memory model.
Note that this implementation is quite compatible with standard
16 bit models. For example, @ (fetch) and ! (store) access the data
segment (the vast majority of FORTH programs use @ and ! to access
data). Another example is that the assembler always puts its code into
the code segment. The programmer need not worry that the code has been
separated from the rest of the program. Even though segmentation is
provided for in a logical fashion, some compiler words must be
implemented differently than in standard FORTH.
There are numerous indirect benefits to this segmentation, over
and above that of memory conservation. Target systems can easily be
saved without heads (the head segment is simply not written to disk).
The segments can be compressed to provide small target systems. And
because machine code is separated from threads, it is actually possible
to save space in the thread segment by re-coding some words in machine
code (the thread segment always fills the fastest). This gives a speed
and a size advantage simultaneously.
Note that this conforms closely to the intended usage of the
architecture of 8086/88 microprocessors. The usual programming battle
with these processors is to overcome this limited architecture.
Here is a summary of the contents of each segment:
Segment   Description                                         Name
=======   ===========                                         ====
CODE      Contains 8086 machine code                          CS:
          pointed to by CS register
THREAD    Contains threaded address lists generated by        TS:
          high level words.
          The code field address points here.
          pointed to by DS register
DATA      holds data from variables, alphanumeric strings,    VS:
          and block buffers.
          pointed to by ES register
HEAD      holds the compile-time word headers, and            HS:
          vocabulary links.
          (segment value calculated when req'd)
STACK     holds the parameter, return and vocabulary          SS:
          stacks and local variables, if used.
          pointed to by SS register
Each segment has a corresponding dictionary pointer, and a set
of basic manipulation words such as CS:@ or HS:, . Note that all the
addresses within these segments are 16 bits. The programmer must
specify the segment to be operated upon by the type of operator used
(eg. @, TS:@, CS:@ etc.)
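As a small sketch of what this means in practice, the same 16 bit
number can be handed to each family of operators and will name a
different cell each time. The offset 100 is arbitrary and chosen only
for illustration, and the stack effects of TS:@ and CS:@ are assumed to
mirror the standard @ .

```forth
DECIMAL
100 @ U.        ( cell at offset 100 of the data segment, VS: )
100 TS:@ U.     ( cell at offset 100 of the thread segment )
100 CS:@ U.     ( cell at offset 100 of the code segment )
```

The three values printed will normally differ, since each fetch reads
from a different 64K segment.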
As MS-DOS tends to vary the position in RAM at which a program
is loaded, each segment also has a word to return the actual position of
the segment (GET:CS, GET:SS etc.). The handy command MEM-MAP displays
all the segments, and their respective dictionary pointers.
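A short interactive sketch of these words follows. It assumes that each
GET: word leaves a single segment value on the stack and that MEM-MAP
takes no arguments; neither stack effect is stated explicitly here.

```forth
HEX
GET:CS U.  GET:SS U.    ( segment values chosen by MS-DOS at load time )
DECIMAL
MEM-MAP                 ( display all segments and dictionary pointers )
```

The values printed are likely to change from one run to the next, which
is why a program should ask for them rather than hard-code them.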
In this documentation and elsewhere, addresses are abbreviated.
For example TS:addr represents an address in the thread segment.
Simply 'addr' refers to the variable segment (most often used).
Some names assume a segment, for example 'compilation address' is always
in the thread segment, name field address is always in the head segment.
CODE SEGMENT
------------
This is the only segment that contains 8086/8088 machine code.
Apart from the space taken by a few pointers used in CREATE DOES>
words, this allows code to reach a full 64K. The assembler places the
definition body into the code segment automatically.
This is always the lowest of the 5 segments. Startup code in
this segment sets up the other segments. In version 1.28 and earlier,
this segment contains the MS-DOS "PSP" (program segment prefix) in its
first 256 bytes. Use GET:PSP in newer versions.
Basic operators:
CS:C@ CS:@ CS:! CS:C! CS:, CS:C, CS:HERE
These are analogous to the standard words: C@ @ ! C! , C, and
HERE, but operate on the code segment.
'CODE operates like ' but returns the address of the executed
code extracted from the compilation address. For example all :
words return the same value from 'CODE because they all call
the common code for nesting colon definitions. It is thus most
useful with CODE words, where it returns the address of the
code loaded by the assembler.
CS:DUMP is a utility that allows bytes to be dumped from this
segment. ( CS:addr, #bytes -- )
There are also some system 'variables' which are used, for
example, at start-up before all the segments have been loaded or
properly positioned.
TOPSEG STACKSIZE TOPSEG SEGPAK LOVEF
CSEG TSEG VSEG HSEG SSEG
The current segment (position in RAM) is returned by:
GET:CS (8086 CS register)
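The code segment words above might be exercised as in the following
sketch. ONE and TWO are hypothetical definitions made up for the
example, DUP is assumed to be a CODE word, and 'CODE is assumed to be
usable from the keyboard in the same way as ' . The byte appended with
CS:C, is never executed; it simply occupies one byte of code space.

```forth
DECIMAL
CS:HERE U.              ( next free offset in the code segment )
144 CS:C,               ( append one byte, 90 hex, the 8086 NOP opcode )
CS:HERE 1 - CS:C@ .     ( read that byte back )

: ONE 1 ;  : TWO 2 ;    ( two hypothetical colon definitions )
'CODE ONE  'CODE TWO  = .   ( true flag: both use the nesting code )
'CODE DUP  16 CS:DUMP   ( first 16 bytes of the code behind DUP )
```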
THREAD SEGMENT
--------------
Forth high-level (:) words are compiled into a sequence of 16
bit addresses, called threads. This segment contains these threads,
CONSTANT and LITERAL values, and pointers to data and code. In the
majority of applications this segment fills up the fastest.
Basic operators:
TS:@ TS:! TS:, TS:HERE
Note that there are no single byte operators - all elements in
this segment are two bytes.
EXECUTE ( TS:addr -- )
Accepts the code field address.
TS:DUMP ( TS:addr, #bytes -- )
Dumps bytes from the specified address.
Many words with compile-time usage accept or return addresses in
this segment:
' ['] -FIND ( -- TS:addr )
FIND ( VS:addr -- VS:addr, 0 or TS:addr, n )
Words created with the following return a thread segment address
at run-time:
CREATE: (alone) or CREATE: DOES:> (pair)
The most often used words for creation are CREATE and
CREATE DOES> (pair). See the Variable segment (below).
In addition the following words add to this segment and have
functions as expected:
COMPILE [COMPILE] wordname LITERAL DLITERAL
See also the technical note on L.O.V.E. Forth compatibility for
examples of compile-time word usage.
TS:BODY> TS:>BODY ( TS: addr -- TS: addr )
are like >BODY and BODY> but operate on the thread segment
only. (see discussion of 'Field access operators' below)
>BODY ( TS:addr -- VS:addr )
operates in LOVE Forth to accept a code field address of a
VARIABLE (or word created by CREATE) and return the data field
address.
>LINK >NAME ( TS:addr -- HS:addr )
are used to access the dictionary header of the specified word.
If TS:addr is not a valid code field address, an error message
is displayed.
NAME> LINK> ( HS:addr -- TS:addr )
are used to find the compilation address from the head address.
FIND-1VOC FIND-VOCS
( addr, addr -- TS:addr,true or false)
are used by FIND. They take the address of the word to find
(usually at HERE) and a vocabulary body as input, and return the
cfa as output (if found).
GET:TS - returns current segment value (8086 DS register)
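As an illustration of the thread segment words, the sketch below defines
a hypothetical word SQUARE, runs it through its compilation address and
then dumps the threads compiled for it. The layout of the dumped bytes
is not described here, so no particular output is implied.

```forth
DECIMAL
: SQUARE ( n -- n*n )  DUP * ;
5 ' SQUARE EXECUTE .    ( run SQUARE via its TS: compilation address )
' SQUARE 16 TS:DUMP     ( look at the threads compiled for SQUARE )
```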
VARIABLE (DATA) SEGMENT
-----------------------
This segment is accessed the most often by application programs.
It contains the data for variables, alphanumeric strings compiled by
." and " , BLOCK buffers, the text input buffer (TIB), PAD and HERE, and
is where space is allocated for programmer defined data structures. Most
standard Forth memory access words work relative to this segment.
Basic operators:
@ ! C@ C! C, , D@ D! +! +C! TYPE ALLOT
TOGGLE BMOVE CMOVE CMOVE> FILL
ENCLOSE EXPECT COUNT TYPE -TRAILING
CONVERT NUMBER #>
HERE PAD WORD
BLOCK LIMIT FIRST BUFFER +BUF R/W
Various I/O words:
L->CRT N$ N$. 'STREAM
TIB, HLD and other VARIABLEs all return addresses in VS:
File name strings passed into DOS words:
<OPEN> OPEN <CREATE> FCREATE INQUIRE <CREATE-NEW> CREATE-NEW
DELETE RENAME CHDIR
Other DOS words:
READ WRITE ENV-SRCH DIR-GET ASCIIZ. ASCIIZ"
Words created by VARIABLE, DVARIABLE, CREATE and CREATE ... DOES>
also return addresses in VS:
DUMP ( addr, #bytes-- ) Dumps the specified bytes.
GET:VS - returns current segment value (8086 ES register)
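Because the standard words work on this segment, ordinary Forth code
needs no changes. A minimal sketch, using a hypothetical buffer named
BUF and assuming FILL takes ( addr, count, byte -- ) :

```forth
DECIMAL
CREATE BUF  16 ALLOT    ( BUF returns a VS: address; 16 bytes reserved )
BUF 16 65 FILL          ( fill the buffer with 65, ASCII A )
BUF 16 DUMP             ( inspect the bytes just written )
BUF 16 TYPE             ( print them as text )
```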
HEAD SEGMENT
------------
The head segment is normally used during compilation only. It
contains the header part of a Forth word definition, including name,
dictionary links and pointers to the locations of the word in other
segments. This segment may be discarded when creating a stand-alone
application program. Utilities such as WORDS and FORGET access this
segment automatically.
Basic operators:
HS:@ HS:! HS:C@ HS:C! HS:, HS:C, TRAVERSE
N>LINK L>NAME LINK> NAME>
.ID LAST
HS:HERE returns the next available address in this segment.
GET:HS - returns current head segment value (calculated)
Note: TOGGLE does not act on HS:, although it is often used in other
Forths to toggle header bits.
Note: the form of the head segment is subject to change in future
versions by the authors without prior notice.
STACK SEGMENT
-------------
This segment holds the Forth parameter, return and vocabulary
stacks, and local variables. The operation of words on this segment is
transparent to the programmer. During development, allowing a full 64K
to the stack segment means that system crashes due to stack overflow are
minimized.
Basic operators: SS:@ SS:! .S
SP@ RP@ LP@ ( -- SS:addr )
These words return stack limits or current positions
S0
is a variable that contains the address of the bottom of stack
SS:HERE
is the dictionary pointer in this segment, but is currently
unused by any words in L.O.V.E. Forth and may be used by
the programmer if so desired.
GET:SS - returns current segment value (8086 SS register)
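The stack words can be tried from the keyboard as sketched below. This
assumes that SP@ returns the SS: address of the item that was on top of
the parameter stack just before SP@ ran, as in most Forth systems; that
detail is not stated here.

```forth
DECIMAL
7 SP@ SS:@ .    ( fetch the top stack item through its SS: address )
DROP            ( discard the 7, which is still on the stack )
.S              ( show what remains )
```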
Field Access Operators
======================
Every word in Forth has a number of parts or fields. These
include the name, link, code and parameter fields. Field access
operators are used to gain access to the various portions of forth
words. In L.O.V.E. Forth, as the parts of words are parcelled between
segments, many of these operators accept an address in one segment and
deliver an address in another. Here is a summary of the standard
field access operators and their functions in LOVE Forth.
>BODY ( TS:addr -- addr )
accepts a code field address of a VARIABLE (or word created by
CREATE) and returns the data field address (in VS:) .
TS:>BODY TS:BODY> ( TS: addr -- TS: addr )
are like >BODY and BODY> but operate on the thread segment
only. Given the compilation address, TS:>BODY returns the
address of the first threaded address (of a : definition), the
data field of a CONSTANT, or the address pointer of a VARIABLE.
Note that there are thus two types of >BODY. >BODY could be
rewritten:
: >BODY TS:>BODY TS:@ ;
>LINK >NAME ( TS:addr -- HS:addr )
are used to access the dictionary header of the specified word.
If TS:addr is not a valid compilation address, an error message
is displayed and execution is ABORTed.
NAME> LINK> ( HS:addr -- TS:addr )
are used to find the compilation address from the name and
link field addresses in the header.
N>LINK L>NAME ( HS:addr -- HS:addr )
are used to move between the name and link fields which are
both in the head segment.
Note that there is no word BODY> to move from the VS: parameter
address of a VARIABLE or CREATEd word to the compilation
address. This is not supported in L.O.V.E. Forth.
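The operators above can be compared side by side in a short session.
COUNTER is a hypothetical variable made up for the example, and .ID is
assumed to take a name field address in HS: . Each comparison should
leave a true flag if the operators behave as described.

```forth
DECIMAL
VARIABLE COUNTER   42 COUNTER !
' COUNTER >BODY @ .                   ( value reached via the cfa )
' COUNTER >BODY  COUNTER = .          ( same VS: address both ways )
' COUNTER TS:>BODY TS:@  COUNTER = .  ( the two-step form of >BODY )
' COUNTER >NAME .ID                   ( head segment: print the name )
' COUNTER >NAME NAME>  ' COUNTER = .  ( round trip, TS: to HS: to TS: )
```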
Long Operators
==============
LOVE Forth contains a set of basic operators which operate on
any area of memory. These words allow the specification of both the
segment and address of the word to be operated upon.
Basic operators: @L !L C@L C!L BMOVEL
Some disk operators will operate on any segment:
READL WRITEL <READL> <WRITEL>
RWTSL EXEC
ENV-SRCH ( string -- seg, addr, f or t )
returns both segment and address of DOS environment
DUMPL ( seg,addr,#bytes -- )
Allows memory to be dumped relative to any segment.
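For instance, DUMPL can be combined with the GET: words to look at a
segment that has no DUMP word of its own. This is only a sketch; the
byte counts are arbitrary.

```forth
DECIMAL
GET:HS 0 64 DUMPL       ( first 64 bytes of the head segment )
GET:VS PAD 16 DUMPL     ( the same memory as:  PAD 16 DUMP )
```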
Memory map
----------
The dictionary pointers move up as more is compiled. Certain
words only use certain segments (eg. a CONSTANT occupies only the thread
and head segments). When any of the dictionary pointers comes within
400 bytes of the maximum available address, the warning message
'GETTING CLOSE TO FULL' is displayed.
The maximum available address in each segment is dependent on
several things. Virtual vocabularies are loaded in high memory, and disk
buffers are also located there (in VS: only, minimum of 2K bytes). The
current maximum addresses are always stored in the VARIABLE TOPS
(contains one cell for each of CS: TS: VS: and HS:). If the program
is very large, it is best to remove any resident virtual vocabulary
with FORGET-SYS.
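The remaining room in a segment could be estimated along the following
lines. This is a sketch under assumptions that are not confirmed here:
that the cells of TOPS are two bytes each and are stored in the order
CS: TS: VS: HS: as listed above. The FREE- names are made up for the
example.

```forth
DECIMAL
( ASSUMED layout of TOPS: cell 0 = CS:, cell 1 = TS:, cell 2 = VS: )
: FREE-CS ( -- u )  TOPS @      CS:HERE - ;
: FREE-TS ( -- u )  TOPS 2 + @  TS:HERE - ;
: FREE-VS ( -- u )  TOPS 4 + @  HERE - ;
FREE-CS U.  FREE-TS U.  FREE-VS U.
```

If the numbers do not agree with what MEM-MAP reports, the assumed
ordering of TOPS is wrong and the offsets need adjusting.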